rl controller
A Learning-based Control Methodology for Transitioning VTOL UAVs
Lin, Zexin, Zhong, Yebin, Wan, Hanwen, Cheng, Jiu, Sun, Zhenglong, Ji, Xiaoqiang
Transition control poses a critical challenge in Vertical Take-Off and Landing Unmanned Aerial Vehicle (VTOL UAV) development, because the tilting-rotor mechanism shifts the center of gravity and the thrust direction during transitions. Existing methods control altitude and position in decoupled loops, which causes significant vibration, neglects the coupling between the two channels, and limits adaptability. In this study, we propose a novel coupled transition control methodology based on a reinforcement learning (RL) driven controller. Moreover, in contrast to the conventional phase-transition approach, the proposed ST3M method takes a new perspective by treating cruise mode as a special case of hover. We validate the method in both simulation and real-world environments, demonstrating efficient controller development and migration, accurate control of UAV position and attitude, outstanding trajectory tracking, and reduced vibration during the transition process.
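The coupled-control idea can be sketched as a single policy action that jointly drives the tilt mechanism and the rotor thrusts, so that altitude and position are never split into separate loops and no discrete phase switch is needed. The action layout, names, and ranges below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def apply_policy_action(action):
    """Map one RL policy output to coupled actuator commands.

    Sketch of coupled transition control: a single action vector jointly
    sets the rotor tilt and the per-rotor thrusts. Action layout and
    ranges are hypothetical.
    """
    tilt = np.clip(action[0], 0.0, 1.0) * np.pi / 2    # 0 rad = hover, pi/2 = cruise
    thrusts = np.clip(action[1:5], 0.0, 1.0)           # normalized rotor thrusts
    # Cruise as a special case of hover: the same command structure covers
    # the entire tilt range, with no discrete mode transition.
    return {"tilt_rad": float(tilt), "thrust": thrusts}

cmd = apply_policy_action(np.array([1.0, 0.6, 0.6, 0.4, 0.4]))
```

Because the tilt angle is just one more continuous action dimension, the policy can trade thrust against tilt at every step instead of handing control between a hover controller and a cruise controller.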
Plasma Shape Control via Zero-shot Generative Reinforcement Learning
Wu, Niannian, Li, Rongpeng, Yang, Zongyu, Xiao, Yong, Wei, Ning, Chen, Yihang, Li, Bo, Zhao, Zhifeng, Zhong, Wulyu
Traditional PID controllers have limited adaptability for plasma shape control, and task-specific reinforcement learning (RL) methods suffer from limited generalization and the need for repetitive retraining. To overcome these challenges, this paper proposes a novel framework for developing a versatile, zero-shot control policy from a large-scale offline dataset of historical PID-controlled discharges. Our approach synergistically combines Generative Adversarial Imitation Learning (GAIL) with Hilbert space representation learning to achieve dual objectives: mimicking the stable operational style of the PID data and constructing a geometrically structured latent space for efficient, goal-directed control. The resulting foundation policy can be deployed for diverse trajectory tracking tasks in a zero-shot manner without any task-specific fine-tuning. Evaluations on the HL-3 tokamak simulator demonstrate that the policy excels at precisely and stably tracking reference trajectories for key shape parameters across a range of plasma scenarios. This work presents a viable pathway toward developing highly flexible and data-efficient intelligent control systems for future fusion reactors.
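The zero-shot mechanism can be illustrated by how the goal enters the policy: the controller is conditioned on the unit direction from the current state to the goal in the learned Hilbert latent space, so a new reference trajectory is simply a stream of goal embeddings and needs no retraining. The encoder below is a toy stand-in (the real encoder is a network trained on the offline PID-controlled discharge data); only the conditioning step is shown:

```python
import numpy as np

def goal_direction(phi, s, g, eps=1e-8):
    """Unit direction from state s to goal g in the learned latent space.

    Conditioning the policy on this vector (rather than on a task-specific
    reward) is what lets one foundation policy track arbitrary reference
    trajectories zero-shot.
    """
    d = phi(g) - phi(s)
    return d / (np.linalg.norm(d) + eps)

# Toy stand-in for the learned Hilbert-space encoder.
phi = np.tanh

# One control step of trajectory tracking: embed the current reference
# point as the goal and feed the direction to the (trained) policy.
z = goal_direction(phi, s=np.zeros(3), g=np.ones(3))
```

Because distances in the latent space are geometrically structured, moving along this unit direction corresponds to goal-directed progress for any target shape, which is what removes the need for task-specific fine-tuning.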
Safe Reinforcement Learning-Based Vibration Control: Overcoming Training Risks with LQR Guidance
Thorat, Rohan Vitthal, Singh, Juhi, Nayek, Rajdip
Structural vibrations induced by external excitations pose significant risks, including safety hazards for occupants, structural damage, and increased maintenance costs. While conventional model-based control strategies, such as the Linear Quadratic Regulator (LQR), effectively mitigate vibrations, their reliance on accurate system models necessitates tedious system identification. This identification process can be avoided with a model-free reinforcement learning (RL) method: RL controllers derive their policies solely from observed structural behaviour, eliminating the requirement for an explicit structural model. For an RL controller to be truly model-free, however, its training must occur on the actual physical system rather than in simulation. During this training phase, the RL controller lacks prior knowledge and exerts essentially random control forces on the structure, which can potentially harm it. To mitigate this risk, we propose guiding the RL controller with an LQR controller. While LQR control typically relies on an accurate structural model for optimal performance, our observations indicate that even an LQR controller based on an entirely incorrect model outperforms the uncontrolled scenario. Motivated by this finding, we introduce a hybrid control framework that integrates both LQR and RL controllers. In this approach, the LQR policy is derived from a randomly selected model with randomly chosen parameters. As this LQR policy does not require knowledge of the true or even an approximate structural model, the overall framework remains model-free. This hybrid approach eliminates dependency on explicit system models while minimizing the exploration risks inherent in naive RL implementations. To the best of our knowledge, this is the first study to address the critical training-safety challenge of RL-based vibration control and to provide a validated solution.
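The hybrid action can be sketched directly: an LQR gain computed from a guessed (not identified) model supplies a stabilizing baseline, and the RL policy only adds a correction on top. The mass-spring-damper numbers below are hypothetical, chosen purely to make the sketch runnable:

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration of the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Randomly guessed structural model (the abstract's point: these parameters
# need not match the true structure for the guiding LQR to help).
dt = 0.01
m, k, c = 1.0, 50.0, 0.5                                    # guessed, not identified
A = np.array([[1.0, dt], [-k / m * dt, 1.0 - c / m * dt]])  # Euler-discretized
B = np.array([[0.0], [dt / m]])
K = lqr_gain(A, B, Q=np.eye(2), R=np.array([[0.1]]))

def hybrid_action(x, rl_correction):
    """LQR term bounds the exploration risk; the RL term learns to adapt."""
    return (-K @ x).item() + rl_correction
```

During training, `rl_correction` comes from the partially trained policy; even when it is noise, the LQR baseline keeps the closed loop far safer than purely random forcing.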
Reinforcement Learning Based Traffic Signal Design to Minimize Queue Lengths
Nandakumar, Anirud, Banerjee, Chayan, Vanajakshi, Lelitha Devi
Efficient traffic signal control (TSC) is crucial for reducing congestion, travel delays, and pollution, and for ensuring road safety. Traditional approaches, such as fixed signal control and actuated control, often struggle to handle dynamic traffic patterns. In this study, we propose a novel adaptive TSC framework that leverages Reinforcement Learning (RL), using the Proximal Policy Optimization (PPO) algorithm, to minimize total queue lengths across all signal phases. The challenge of efficiently representing highly stochastic traffic conditions for an RL controller is addressed through multiple state representations, including an expanded state space, an autoencoder representation, and a K-Planes-inspired representation. The proposed algorithm has been implemented using the Simulation of Urban Mobility (SUMO) traffic simulator and demonstrates superior performance over both traditional methods and other conventional RL-based approaches in reducing queue lengths. The best-performing configuration achieves an approximately 29% reduction in average queue lengths compared to the traditional Webster method. Furthermore, a comparative evaluation of alternative reward formulations demonstrates the effectiveness of the proposed queue-based approach, showcasing the potential for scalable and adaptive urban traffic management.
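The queue-based objective at the core of the framework fits in a few lines (a minimal sketch: the phase names and lane counts below are made up, and the paper's actual state encodings, such as the autoencoder and K-Planes-inspired representations, are much richer):

```python
def queue_reward(queues_by_phase):
    """Reward for the signal-control agent: the negative of the total queue
    length summed over all signal phases (longer queues -> lower reward)."""
    return -sum(sum(lanes) for lanes in queues_by_phase.values())

# Hypothetical snapshot: queued vehicles per approach lane, grouped by phase.
queues = {
    "NS_through": [4, 6],
    "EW_through": [3, 2],
    "NS_left":    [1, 0],
    "EW_left":    [2, 1],
}
r = queue_reward(queues)
```

Maximizing this reward is equivalent to minimizing the total queue length across phases, which is the quantity the PPO agent is trained to drive down at every decision step.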